Fix LogisticRegressionCV refit=False coefficient paths (swev-id: scikit-learn__scikit-learn-14087) #40

Open
casey-brooks wants to merge 1 commit into scikit-learn__scikit-learn-14087 from fix/logregcv-refit-false-indexerror

@casey-brooks casey-brooks commented Dec 26, 2025

Summary

  • stabilize LogisticRegressionCV refit=False aggregation so per-fold paths have consistent dimensions
  • update elastic-net handling to compute per-class averages without index errors
  • add regression coverage for binary saga/liblinear and multinomial saga scenarios
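
The aggregation described in the bullets above can be sketched with plain NumPy. This is an illustrative sketch only, not the code in this PR: `average_best_coefs` and its argument names are hypothetical, and the layouts `(n_folds, n_Cs, n_features)` for binary/one-vs-rest paths and `(n_classes, n_folds, n_Cs, n_features)` for multinomial paths are assumptions made for the example.

```python
import numpy as np


def average_best_coefs(coefs_paths, best_indices):
    """Average each fold's coefficients taken at that fold's best C index.

    Hypothetical helper for illustration; not part of scikit-learn.
    coefs_paths: (n_folds, n_Cs, n_features) or
                 (n_classes, n_folds, n_Cs, n_features)  (assumed layouts)
    best_indices: (n_folds,) integer index into the C axis, one per fold.
    """
    coefs_paths = np.asarray(coefs_paths)
    best_indices = np.asarray(best_indices)
    folds = np.arange(best_indices.shape[0])

    if coefs_paths.ndim == 3:
        # Select (fold i, best C of fold i): result is (n_folds, n_features).
        picked = coefs_paths[folds, best_indices]
        return picked.mean(axis=0)
    if coefs_paths.ndim == 4:
        # Keep the leading class axis: result is (n_classes, n_folds, n_features).
        picked = coefs_paths[:, folds, best_indices]
        return picked.mean(axis=1)
    raise ValueError("expected a 3-D or 4-D coefficient path array")


rng = np.random.default_rng(0)
best = np.array([0, 3, 3, 9, 1])             # one best C index per fold
binary_paths = rng.normal(size=(5, 10, 3))   # 5 folds, 10 Cs, 3 features
multi_paths = rng.normal(size=(4, 5, 10, 3)) # 4 classes on the leading axis

binary_avg = average_best_coefs(binary_paths, best)
multi_avg = average_best_coefs(multi_paths, best)
print(binary_avg.shape, multi_avg.shape)
```

Branching on the number of dimensions keeps the indexing expression in step with the array that was actually built, which is the kind of consistency the first bullet refers to.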

Fixes #36

Reproduction Steps

import sys
import sklearn
from sklearn.linear_model import LogisticRegressionCV
import numpy as np

np.random.seed(29)
X = np.random.normal(size=(1000, 3))
beta = np.random.normal(size=3)
intercept = np.random.normal(size=None)
y = np.sign(intercept + X @ beta)

LogisticRegressionCV(
    cv=5,
    solver='saga',  # same error with 'liblinear'
    tol=1e-2,
    refit=False,
).fit(X, y)

Actual Results

Traceback (most recent call last):
  File "<stdin>", line 11, in <module>
  File "/workspace/scikit-learn/sklearn/linear_model/logistic.py", line 2178, in fit
    for i in range(len(folds))], axis=0)
  File "/workspace/scikit-learn/sklearn/linear_model/logistic.py", line 2178, in <listcomp>
    for i in range(len(folds))], axis=0)
IndexError: too many indices for array: array is 3-dimensional, but 4 were indexed
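
The error class in this traceback is NumPy's generic response to indexing an array with more indices than it has dimensions. A minimal illustration follows, assuming hypothetical shapes (5 folds, 10 C values, 3 features); it is not the failing line from logistic.py, only the same kind of mismatch.

```python
import numpy as np

# Hypothetical shapes for illustration: 5 folds, 10 C values, 3 features.
coefs_paths = np.zeros((5, 10, 3))
best_index = 2

try:
    # Four indices on a 3-D array, as if a leading class axis existed.
    coefs_paths[:, 0, best_index, :]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

The exact wording of the message varies between NumPy versions, but it begins with "too many indices for array".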

Testing

  • pytest sklearn/linear_model/tests/test_logistic.py
  • flake8 sklearn/linear_model/logistic.py sklearn/linear_model/tests/test_logistic.py --extend-ignore=E731,E117

@casey-brooks casey-brooks requested a review from a team December 26, 2025 02:55
@casey-brooks (Author)

Test & Lint Summary

  • pytest sklearn/linear_model/tests/test_logistic.py (174 passed, 0 failed, 0 skipped)
  • flake8 sklearn/linear_model/logistic.py sklearn/linear_model/tests/test_logistic.py --extend-ignore=E731,E117 (no issues)

@casey-brooks casey-brooks changed the title from "Fix LogisticRegressionCV refit=False coefficient paths" to "Fix LogisticRegressionCV refit=False coefficient paths (swev-id: scikit-learn__scikit-learn-14087)" Dec 26, 2025

@noa-lucent noa-lucent left a comment

Centering X before the EM loop without applying the same translation to user-specified initial centers breaks the ndarray init path. The first E-step now compares mean-centered samples against un-shifted seeds, effectively offsetting them by +Xmean. That regresses behaviour for callers that pass explicit centroids—please translate supplied centers into the mean-centered space (and add the mean back when returning) or drop the centering change.
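
The translation requested in this comment could look roughly like the sketch below. It assumes a k-means-style assignment step with squared Euclidean distances; `assign_labels` and every variable name are hypothetical and are not taken from the code under review.

```python
import numpy as np


def assign_labels(X, centers):
    """Return the index of the nearest center for each row of X."""
    # Squared Euclidean distances, shape (n_samples, n_centers).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


rng = np.random.default_rng(0)
X = rng.normal(loc=10.0, size=(200, 2))
init_centers = X[:3].copy()  # user-supplied seeds, in the original space

X_mean = X.mean(axis=0)
X_centered = X - X_mean

# Reference: no centering at all.
labels_reference = assign_labels(X, init_centers)

# Problem pattern: centered samples compared against un-shifted seeds.
# These labels generally disagree with the reference.
labels_unshifted = assign_labels(X_centered, init_centers)

# Requested pattern: move the seeds into the centered space as well.
centers_centered = init_centers - X_mean
labels_shifted = assign_labels(X_centered, centers_centered)

# Add the mean back before returning centers to the caller.
centers_returned = centers_centered + X_mean
```

Because distances are translation invariant, shifting both samples and seeds by the same mean leaves the assignments unchanged, while shifting only the samples does not.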

@noa-lucent noa-lucent left a comment

Verified the reshaping/stacking logic keeps coefficient paths consistent for refit=False across binary and multinomial cases, and the added regression tests cover the saga/liblinear scenarios that previously failed. Looks good.
